In [ ]:
%%HTML
<style>
.container { width:100% }
</style>

Classifying Flowers using a Support Vector Machine

In this notebook we will use different support vector machines to classify flowers from the Iris dataset using only the petal length and the petal width. We will also show how to plot the decision boundary of a support vector machine.

We start with the usual imports for reading the data and plotting.


In [ ]:
import numpy             as np
import pandas            as pd
import matplotlib.pyplot as plt
import seaborn           as sns

In order to use support vector machines we have to import the module svm from scikit-learn.


In [ ]:
import sklearn.svm as svm

Let us load the data and store it in a data frame.


In [ ]:
IrisDF = pd.read_csv('iris.csv')
IrisDF.head()

The function $\texttt{name_to_number}(name)$ converts the name of the flower into a number.


In [ ]:
def name_to_number(name):
    if name == 'setosa':
        return 0
    if name == 'versicolor':
        return 1
    return 2

Since we want to have a two-dimensional model, we will only use the petal length and the petal width.


In [ ]:
X = np.array(IrisDF[['petal_length', 'petal_width']])
y = np.array([name_to_number(name) for name in IrisDF['species']])

In [ ]:
X.shape, y.shape

In order to plot the decision boundary of the linear model, we define the function $\texttt{make_meshgrid}(x, y, h)$. This function gets two vectors $x$ and $y$ as inputs. The parameter $h$ is the stepsize. It returns a pair $(X, Y)$ where both $X$ and $Y$ are matrices of the same shape.


In [ ]:
def make_meshgrid(x, y, h=0.005):
    x_min, x_max = x.min() - 1, x.max() + 1
    y_min, y_max = y.min() - 1, y.max() + 1
    return np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
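
As a quick sanity check (this small cell is not part of the original analysis), two short vectors produce two matrices of identical shape that together enumerate all points of the grid.


In [ ]:
XX, YY = make_meshgrid(np.array([0.0, 1.0]), np.array([0.0, 2.0]), h=0.5)
XX.shape, YY.shape   # both matrices have the same shape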

The function $\texttt{plot_contour}(M, X, Y)$ plots the decision boundaries of the classifier $M$. $X$ and $Y$ are the meshgrids for the x-axis and the y-axis.


In [ ]:
def plot_contour(M, X, Y):
    Z = M.predict(np.c_[X.ravel(), Y.ravel()])
    Z = Z.reshape(X.shape)
    plt.contour(X, Y, Z)

Given a model $M$ and a two-dimensional design matrix $X$, this function plots the data from $X$ together with the decision boundary of $M$.


In [ ]:
def plot_data_and_boundary(X, M, title):
    X0, X1 = X[:, 0], X[:, 1]
    XX, YY = make_meshgrid(X0, X1)
    plt.figure(figsize=(15, 10))
    sns.set(style='darkgrid')
    plot_contour(M, XX, YY)
    plt.scatter(X0, X1, c=y, edgecolors='k')
    plt.xlim(XX.min(), XX.max())
    plt.ylim(YY.min(), YY.max())
    plt.xlabel('Petal length')
    plt.ylabel('Petal width')
    plt.xticks()
    plt.yticks()
    plt.title(title)

We will start with a linear kernel. The parameter $C$ is set to $100000$; since $C$ is the inverse of the regularization strength, such a large value essentially means that there is no regularization.


In [ ]:
M = svm.SVC(kernel='linear', C=100000)
M.fit(X, y)
M.score(X, y)

In [ ]:
plot_data_and_boundary(X, M, 'Support Vector Machine with Linear Kernel')

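To see the effect of regularization, we can refit the linear model for a few different values of $C$; smaller values of $C$ correspond to stronger regularization. This is a small side experiment that is not part of the original analysis; the name $\texttt{M_C}$ is only used here so that the model $M$ from above is not overwritten.


In [ ]:
for C in [0.01, 1, 100, 100000]:
    M_C = svm.SVC(kernel='linear', C=C)   # throwaway model for this experiment
    M_C.fit(X, y)
    print(f'C = {C:>8}: training accuracy = {M_C.score(X, y):.3f}')
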
The class SVC uses a One-vs-One classifier, i.e. in this example it builds three support vector machines:

  • The first SVM separates setosa from versicolor.
  • The second SVM separates setosa from virginica.
  • The third SVM separates virginica from versicolor.
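
The following cell gives a rough sketch of this one-vs-one scheme: it fits one binary SVM per pair of classes, which is what $\texttt{SVC}$ does internally. The dictionary $\texttt{pairwise_models}$ and the names $a$ and $b$ are only used for illustration.


In [ ]:
import itertools

pairwise_models = {}
for a, b in itertools.combinations([0, 1, 2], 2):
    mask = (y == a) | (y == b)                  # keep only the classes a and b
    M_ab = svm.SVC(kernel='linear', C=100000)
    M_ab.fit(X[mask], y[mask])
    pairwise_models[(a, b)] = M_ab

len(pairwise_models)                            # three binary classifiers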

Let's try a Gaussian kernel function next.


In [ ]:
M = svm.SVC(kernel='rbf', gamma=1.5, C=10000)
M.fit(X, y)
M.score(X, y)

In [ ]:
plot_data_and_boundary(X, M, 'Support Vector Machine with Gaussian Kernel')

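The parameter $\texttt{gamma}$ controls the width of the Gaussian kernel: larger values allow a more wiggly decision boundary. The following quick check of the training accuracy for a few arbitrarily chosen values of $\texttt{gamma}$ is not part of the original notebook; $\texttt{M_g}$ is a throwaway name.


In [ ]:
for gamma in [0.1, 1.5, 10.0, 100.0]:
    M_g = svm.SVC(kernel='rbf', gamma=gamma, C=10000)  # throwaway model
    M_g.fit(X, y)
    print(f'gamma = {gamma:>5}: training accuracy = {M_g.score(X, y):.3f}')

Next, we try a polynomial kernel of degree $3$.
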
In [ ]:
M = svm.SVC(kernel='poly', degree=3, gamma='auto', C=10000)
M.fit(X, y)
M.score(X, y)

In [ ]:
plot_data_and_boundary(X, M, 'Support Vector Machine with Polynomial Kernel of Degree 3')

Let's set the degree parameter to a higher value.


In [ ]:
M = svm.SVC(kernel='poly', degree=5, gamma='auto', C=10000)
M.fit(X, y)
M.score(X, y)

In [ ]:
plot_data_and_boundary(X, M, 'Support Vector Machine with Polynomial Kernel of Degree 5')

Let us now use all four features of the data, i.e. the sepal length and sepal width in addition to the petal length and petal width.


In [ ]:
X = np.array(IrisDF[['sepal_length', 'sepal_width', 'petal_length', 'petal_width']])

In [ ]:
M = svm.SVC(kernel='linear', C=100000)
M.fit(X, y)
M.score(X, y)

In [ ]:
M = svm.SVC(kernel='rbf', gamma=1.5, C=10000)
M.fit(X, y)
M.score(X, y)

In [ ]:
M = svm.SVC(kernel='poly', degree=3, gamma='auto', C=10000)
M.fit(X, y)
M.score(X, y)

These results illustrate why support vector machines are so popular today. They often need more training time than other methods, but in terms of accuracy they frequently beat simpler approaches.
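
Note that all scores above are computed on the training data. As a possible follow-up, sketched below under the assumption that the module $\texttt{sklearn.model_selection}$ is available, we could estimate the generalization accuracy with $5$-fold cross-validation.


In [ ]:
import sklearn.model_selection as ms

M = svm.SVC(kernel='rbf', gamma=1.5, C=10000)
scores = ms.cross_val_score(M, X, y, cv=5)     # accuracy on 5 held-out folds
scores.mean()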